Scatter plots draw one point per observation, which lets us see possible correlations between two features of a data set. Let's see how we can create them with ggplot2!
We'll use the built-in mtcars dataset:
library('ggplot2')
df <- mtcars
head(df)
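# Quick scatter plot of weight (wt) against miles per gallon (mpg)
# Note: qplot() is deprecated as of ggplot2 3.4.0; it still runs but emits a warning
qplot(wt,mpg,data=df)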
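The downward trend in this plot can also be put into a number. As a minimal sketch (the cor() check is an addition here, not part of the original walkthrough), base R's cor() computes the Pearson correlation between the two columns:

```r
# Pearson correlation between weight and fuel economy in mtcars
df <- mtcars
r <- cor(df$wt, df$mpg)
print(round(r, 2))  # negative: heavier cars tend to get fewer miles per gallon
```

A value close to -1 or 1 matches a tight diagonal band of points, while a value near 0 matches a shapeless cloud.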
We can add a third feature by mapping it to a color gradient, or by sizing each point according to its value of that feature. For example:
qplot(wt,mpg,data=df,color=cyl)
qplot(wt,mpg,data=df,size=cyl)
qplot(wt,mpg,data=df,size=cyl,color=cyl)
# Show 4 features (this gets messy)
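# I() sets alpha to a constant; a bare alpha=0.6 would be mapped as an aesthetic and get its own legend
qplot(wt,mpg,data=df,size=cyl,color=hp,alpha=I(0.6))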
Now let's see how to get more control by using ggplot():
pl <- ggplot(data=df,aes(x = wt,y=mpg))
pl + geom_point()
# cyl is numeric, so color is drawn as a continuous gradient
pl <- ggplot(data=df,aes(x = wt,y=mpg))
pl + geom_point(aes(color=cyl))
# factor() treats cyl as categorical, giving one discrete color per value
pl <- ggplot(data=df,aes(x = wt,y=mpg))
pl + geom_point(aes(color=factor(cyl)))
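The difference between the gradient legend and the discrete legend comes down to the type of cyl. A small sketch, using only base R, of what factor() changes (the name cyl_f is chosen here purely for illustration):

```r
df <- mtcars
class(df$cyl)            # "numeric", so ggplot2 picks a continuous color scale
cyl_f <- factor(df$cyl)
class(cyl_f)             # "factor", so ggplot2 picks a discrete color scale
levels(cyl_f)            # the distinct cylinder counts: "4" "6" "8"
```

Since cars only come with 4, 6, or 8 cylinders in this data set, the discrete version is usually the easier legend to read.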
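# Size can be mapped to a factor too, but ggplot2 warns that size is not advised for a discrete variable
pl <- ggplot(data=df,aes(x = wt,y=mpg))
pl + geom_point(aes(size=factor(cyl)))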
# With Shapes
pl <- ggplot(data=df,aes(x = wt,y=mpg))
pl + geom_point(aes(shape=factor(cyl)))
# Better version: shape and color both mapped to cyl, with larger, semi-transparent points
pl <- ggplot(data=df,aes(x = wt,y=mpg))
pl + geom_point(aes(shape=factor(cyl),color=factor(cyl)),size=4,alpha=0.6)
# Color by horsepower with a custom gradient from blue (low hp) to red (high hp)
pl + geom_point(aes(color=hp),size=4) + scale_color_gradient(high='red',low='blue')
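For comparison, the four-feature qplot() call from earlier could be rewritten with ggplot() roughly as follows. This is a sketch rather than a definitive translation: aesthetics that depend on the data go inside aes(), while constants such as alpha are set outside it. The name pl4 is chosen here for illustration.

```r
library(ggplot2)
df <- mtcars
# Mapped aesthetics (inside aes) vary with the data; alpha is a fixed setting
pl4 <- ggplot(data = df, aes(x = wt, y = mpg)) +
  geom_point(aes(size = cyl, color = hp), alpha = 0.6)
print(pl4)
```

Keeping constants outside aes() means they do not show up as an extra legend entry.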
Great! That's it for scatter plots. Remember to reference the cheat sheet or the documentation for more details!